Recommender systems are crucial tools to overcome the information overload brought about by
the Internet. Rigorous tests are needed to establish to what extent sophisticated methods can
improve the quality of the predictions. Here we analyse a refined correlation-based collaborative
filtering algorithm and compare it with a novel spectral method for recommending. We test
them on two databases that bear different statistical properties (MovieLens and Jester) without
filtering out the less active users and ordering the opinions in time, whenever possible. We find
that, when the distribution of user-user correlations is narrow, simple averages work nearly as
well as advanced methods. Recommender systems can, on the other hand, exploit a great deal of
additional information in systems where external influence is negligible and people's tastes emerge
entirely. These findings are validated by simulations with artificially generated data.

1. INTRODUCTION

One of the most amazing trends of today's globalized economy is peer production
[Anderson 2006]. An unprecedented mass of unpaid workers is contributing to
the growth of the World Wide Web: some build entire pages, some only drop casual
comments, having no other reward than reputation [Masum and Zhang 2004].
Many successful websites (e.g. Blogger and MySpace) are just platforms holding
user-generated content. The information thus conveyed is particularly valuable because
it contains personal opinions, with no specific corporate interest. It is, at the
same time, very hard to go through it and judge its degree of reliability. If you want
to use it, you need to filter this information, select what is relevant and aggregate
it; you need to reduce the information overload [Maes 1994].

As a matter of fact, opinion filtering has become rather common on the web.
There exist search engines (e.g. Google News) that are able to extract news from
journals, websites (e.g. Digg) that harvest them from blogs, and platforms (e.g. Epinions)
that collect and aggregate votes on products. The basic version of these
systems ranks the objects once and for all, assuming they have an intrinsic value,
independent of the personal taste of the demander [Laureti et al. 2006]. They lack
personalisation [Kelleher 2006], which constitutes the new frontier of online services.

Users need only browse the web in order to leave recorded traces, and any
comments they drop add to them. The more information you release, the better
the service you receive. Personal information can, in fact, be exploited by recommender
systems. The deal becomes, at the same time, beneficial to the community,
as every piece of information can potentially improve the filtering procedures. Amazon.com,
for instance, uses one's purchase history to provide individual suggestions.
If you have bought a physics book, Amazon recommends other physics books to you:
this is called item-based recommendation [Breese et al. 1998; Sarwar et al. 2001].
Those who have experience with it know that this system works fairly well, but
it is conservative, as it rarely dares to suggest books on subjects you have
never explored. We believe a good recommender system should sometimes help
uncover people's hidden wants [Maslov and Zhang 2001].

Collaborative filtering is currently the most successful implementation of recommendation
systems. It essentially consists in recommending you items that users
whose tastes are similar to yours have liked. In order to do that, one needs to collect
taste information from many users and define a measure of similarity. The
easiest and most common way to do it is to use either correlations or Euclidean
distances.

Here we test a correlation-based algorithm and a spectral method to make pre- 
dictions. We describe these two families of recommender systems in section 2, and 
propose some improvements to currently used algorithms. In section 3 we present 
the results of our predicting methods on the MovieLens and Jester data sets, as 
well as on artificial data. We argue that the distribution of correlations in the sys- 
tem is the key ingredient to state whether or not sophisticated recommendations 
outperform simple averages. Finally we draw some conclusions in section 4. 

2. METHODS 

Our aim here is to test two methods for recommending, spectral and correlation-based,
on different data sets. The starting point is data collection. One typically
has a system of $N$ users, $M$ items and $n$ evaluations. Opinions, books, restaurants
or any other object can be treated, although we shall examine in detail two fundamentally
different examples: movies and jokes. Each user $i$ evaluates a pool of $n_i$
items and each item $\alpha$ receives $n_\alpha$ evaluations, with
$n = \sum_{i=1}^{N} n_i = \sum_{\alpha=1}^{M} n_\alpha$. The
votes $v_{i\alpha}$ can be gathered in a matrix $V$. If a user $j$ has not voted on item $\beta$, the
corresponding matrix element takes a constant value $v_{j\beta} = \mathrm{EMPTY}$, usually set to
zero.

Once the data have been collected into the voting matrix, we aim to predict votes before
they are expressed. That is, we would like to predict whether agent $j$ would appreciate
the movie, book or food $\beta$ before she has actually watched, read or eaten it. Say we
predict that user $j$ would give a very high vote to item $\beta$ if she were exposed to it;
we can then recommend $\beta$ to $j$ and verify a posteriori her appreciation. Ideally, we
would like to have a prediction for every EMPTY element of $V$.

Most websites only allow votes to be chosen from a finite set of values. In order to
take into account the fact that each person adopts an individual scale, we compute
each user $i$'s average expressed vote $\langle v_i \rangle$ and subtract it from the non-empty
$v_{i\alpha}$'s. The methods we analyze give predictions in the following form [Delgado 1999]:

$$v'_{j\beta} = \langle v_j \rangle + \sum_{i=1}^{N} S_{ji} \left( v_{i\beta} - \langle v_i \rangle \right), \qquad (1)$$

where $v'_{j\beta}$ is the predicted vote and $S$ is a similarity matrix. The choice of $S$ is the
crucial issue of collaborative filtering. One has, in fact, very often to face a lack of
data, which makes it difficult to estimate the similarity between non-overlapping
users. We shall describe, in the following, correlation-based and spectral techniques
to cope with this problem.
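
As an illustration, the prediction rule can be sketched in a few lines of NumPy. This is a minimal sketch rather than the authors' implementation. It assumes that eq. (1) has the mean-centred form $v'_{j\beta} = \langle v_j \rangle + \sum_i S_{ji} (v_{i\beta} - \langle v_i \rangle)$, that the votes are stored in a dense float array `V` with a boolean array `mask` marking the expressed votes, and that EMPTY entries contribute nothing to the sum. The names `user_means` and `predict` are hypothetical.

```python
import numpy as np

def user_means(V, mask):
    """Average expressed vote <v_i> of each user (0 for users with no votes)."""
    counts = mask.sum(axis=1)
    sums = np.where(mask, V, 0.0).sum(axis=1)
    return np.divide(sums, counts, out=np.zeros(len(V)), where=counts > 0)

def predict(V, mask, S):
    """Predicted votes for every (user, item) pair.

    Assumed form of eq. (1): v'[j, b] = <v_j> + sum_i S[j, i] * (v[i, b] - <v_i>),
    where only users i who actually voted on item b contribute.
    """
    means = user_means(V, mask)
    # Mean-centred votes; EMPTY entries are set to zero so they drop out of the sum.
    centered = np.where(mask, V - means[:, None], 0.0)
    return means[:, None] + S @ centered
```

Any similarity matrix `S`, whether correlation-based or spectral, can be plugged into `predict` unchanged; only the construction of `S` differs between the two families of methods.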

2.1 Correlation 

Correlation-based methods for recommending make use of user-user linear correlations
as similarity measures. If we call $\langle v_i \rangle$ the average vote expressed by user $i$,
the correlations $C_{ij}$ can be calculated with Pearson's formula [Press et al. 1992]:

$$C_{ij} = \frac{\sum_\alpha (v_{i\alpha} - \langle v_i \rangle)(v_{j\alpha} - \langle v_j \rangle)}{\sqrt{\sum_\alpha (v_{i\alpha} - \langle v_i \rangle)^2} \, \sqrt{\sum_\alpha (v_{j\alpha} - \langle v_j \rangle)^2}}, \qquad (2)$$

with $C_{ij} = 0$ if users $i$ and $j$ haven't judged more than one item in common.
Unexpressed votes can then be forecast by setting $S_{ij} \propto C_{ij}$ in eq. (1). This
estimation is often used as a rule of thumb, without knowing whether and why it is
justified. There is, to our knowledge, only one model [Bagnoli et al. 2003] where the
convergence of $v'_{j\beta}$ to the real vote, in the limit of an infinite system, is guaranteed.
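
The following sketch shows one way Pearson's formula could be evaluated on a sparse voting matrix. It is an illustration under stated assumptions, not the authors' code: the sums are taken over the items both users have rated, $\langle v_i \rangle$ is the average over all votes expressed by user $i$, and pairs with fewer than `n_c` common items get $C_{ij} = 0$. The function name `pearson_common` and the default `n_c = 2` (more than one common item) are hypothetical choices.

```python
import numpy as np

def pearson_common(V, mask, n_c=2):
    """User-user Pearson correlations restricted to commonly rated items.

    V    : (N, M) float array of votes
    mask : (N, M) boolean array, True where a vote was expressed
    n_c  : minimal number of common items needed to compute C_ij
    Pairs with too few common items, or with zero variance, keep C_ij = 0.
    """
    N = V.shape[0]
    means = np.array([V[i, mask[i]].mean() if mask[i].any() else 0.0
                      for i in range(N)])
    C = np.zeros((N, N))
    for i in range(N):
        for j in range(i + 1, N):
            common = mask[i] & mask[j]
            if common.sum() < n_c:
                continue
            a = V[i, common] - means[i]
            b = V[j, common] - means[j]
            denom = np.sqrt((a @ a) * (b @ b))
            if denom > 0:
                C[i, j] = C[j, i] = (a @ b) / denom
    return C
```

The double loop costs $O(N^2 M)$ operations, which is acceptable for a few thousand users; a vectorised version would be preferable for larger populations.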

The use of pair correlations alone is often not very effective in predicting tastes.
In fact, $C_{ij}$ is a measure of similarity between the behavior of two users who have
expressed votes on a number $n_{ij}$ of commonly evaluated items. When the matrix $V$
is very sparse and $n_{ij}$, as a consequence, small or zero for many pairs of users, such
a measure becomes poorly informative. A popular solution to this problem [Blattner
et al. 2007] is to estimate unknown votes via a linear combination of $v'_{j\beta}$ and a global
average, i.e. $v''_{j\beta} \propto q v'_{j\beta} + (1 - q) m(\beta)$, where $q$ is a constant weight between 0 and
1 and $m(\beta)$ the average vote expressed on item $\beta$ by all users. The typical choice,
$q = 1/2$, amounts to defining $S_{ij} = 1/n_\beta + C_{ij}$ in eq. (1).

After testing many different versions of correlation-based recommendations, with
the same number of free parameters, we chose a more effective ansatz: every time
$C_{ij} = 0$ we replace it with the average value of the correlation across the population.
Such a mean-field solution improves the results and eliminates, at the same time,
the parameter $q$. As for the normalization of the weighted sum in eq. (1), we found
that the best choice is $S_{ji} = C_{ji} / \sum_{l} |C_{jl}|$. Small adjustments can be made to
the minimal number of common items $n_c$ required to compute correlations. A low
value of $n_c$ may, in fact, improve the average prediction quality at the price of very
large fluctuations. We set $n_c = 3$ in most cases.
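
A possible reading of the mean-field ansatz and of the normalization is sketched below. Several details are assumptions made for the sake of a runnable example: a boolean array `known` marks the pairs for which a correlation could actually be computed, the population average is taken over those pairs only, the diagonal is set to zero so that a user does not weight her own votes, and the normalization is $S_{ji} = C_{ji} / \sum_l |C_{jl}|$. The name `similarity_from_correlations` is hypothetical.

```python
import numpy as np

def similarity_from_correlations(C, known):
    """Fill missing correlations with the population mean, then normalise each row.

    C     : (N, N) symmetric array of pair correlations
    known : (N, N) boolean array, True where C_ij was computed from enough common items
    """
    off_diag = ~np.eye(len(C), dtype=bool)
    valid = known & off_diag
    # Mean-field value: average correlation over all computable pairs (assumption).
    mean_c = C[valid].mean() if valid.any() else 0.0
    filled = np.where(valid, C, mean_c)
    np.fill_diagonal(filled, 0.0)
    # Row normalisation by the sum of absolute correlations.
    norm = np.abs(filled).sum(axis=1, keepdims=True)
    return np.divide(filled, norm, out=np.zeros_like(filled), where=norm > 0)
```

With this normalization the absolute weights of each user sum to one, so the correction added to $\langle v_j \rangle$ stays within the range of the mean-centred votes.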

Let us stress here that the method we used can be further improved by calibrating
additional parameters. Case amplification [Breese et al. 1998], for instance, consists
in taking a similarity measure proportional to some power $\gamma$ of the correlation,
in order to punish low correlation values. Upon optimizing the value of $\gamma$, the
predictive power of our correlation-based method is able to compete with that of
spectral techniques [Moret 2007].

2.2 Spectral 

An alternative approach to recommending makes use of spectral techniques [Sarwar
et al. 2000; Billsus and Pazzani 1998]. Users can be represented by their
vectors of votes $\mathbf{v}_i$ in an $M$-dimensional metric space $\mathcal{M}$. One can define, between
all pairs of users, an overlap $\Omega_{ij}$ as a decreasing function of their distance.
Users can thus be represented as nodes of a weighted graph. Spectral methods
have recently been used to detect clusters in networks [Seary and Richards 2003;
Donetti and Munoz 2004; Capocci et al. 2005] and can equally be applied to devise
recommender systems. In order to implement them for this purpose, we take
the following fundamental steps: i) Calculate the overlap matrix $\Omega$ that defines
our weighted graph. ii) Find the spectrum of its Laplacian after dimensionality
reduction. iii) Calculate user similarities using the eigenvectors' elements, and make
predictions with eq. (1). The method is not trivial and each one of the preceding
points contains subtleties. After testing many different options, we have used the
one that yields the best and most stable results.

The definitions and the procedures we shall describe here rest on the following
assumption: if we had the votes of all users on all items at our disposal, the points of
coordinates $\mathbf{v}_i$ would only occupy a compact subspace of $\mathcal{M}$, a manifold-like structure
of dimension $k \ll M$. This approach, whose validity will be verified a posteriori,
is inspired by ref. [Belkin and Niyogi 2003], where the interested reader can find
the details.

A preliminary step is the substitution of the EMPTY entries of the voting matrix
with the corresponding object's average received vote $m(\alpha)$. We then define the
elements of the overlap matrix as $\Omega_{ij} = \exp(-d_{ij}^2/\Gamma^2)$, where $d_{ij} = \|\mathbf{v}_i - \mathbf{v}_j\|$
is the Euclidean distance between the vectors of votes of users $i$ and $j$. $\Omega$ gives
higher weights to pairs of users whose votes are closer, and reaches its maximum
$\Omega_{ij} = 1$ when the distance is 0. The external parameter $\Gamma^2$ controls the size of the
neighborhood, since $\Omega_{ij} \to 0$ for $d_{ij} \gg \Gamma$. The performance of our experiments was
almost unchanged within a wide range of this parameter. We fixed $\Gamma = \max_{ij} d_{ij}$,
which allows us to expand the exponential as $\exp(-d^2/\Gamma^2) \simeq 1 - (d/\Gamma)^2$. Hence we set

$$\Omega_{ij} = 1 - \left( \frac{\|\mathbf{v}_i - \mathbf{v}_j\|}{\max_{ij} \|\mathbf{v}_i - \mathbf{v}_j\|} \right)^2 .$$
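
The construction of the overlap matrix can be sketched as follows. This is an illustrative implementation under assumptions: EMPTY entries are filled with the item average $m(\alpha)$ computed from the expressed votes, and the overlap is taken as one minus the squared distance divided by the largest squared distance. The name `overlap_matrix` is hypothetical.

```python
import numpy as np

def overlap_matrix(V, mask):
    """Overlap Omega_ij = 1 - (d_ij / max d)^2 between users.

    V    : (N, M) float array of votes
    mask : (N, M) boolean array, True where a vote was expressed
    """
    # Item averages m(alpha) over expressed votes (0 for items with no votes).
    counts = mask.sum(axis=0)
    item_avg = np.divide(np.where(mask, V, 0.0).sum(axis=0), counts,
                         out=np.zeros(V.shape[1]), where=counts > 0)
    filled = np.where(mask, V, item_avg[None, :])
    # Squared Euclidean distances via |a - b|^2 = |a|^2 + |b|^2 - 2 a.b
    sq = (filled ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * filled @ filled.T, 0.0)
    d2_max = d2.max()
    if d2_max == 0:
        return np.ones_like(d2)  # all users identical
    return 1.0 - d2 / d2_max
```

By construction the result is symmetric, equals 1 on the diagonal and vanishes for the most distant pair of users.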

Fig. 1. Identification of clusters of users in Jester with spectral analysis. Left graph: 10 × value
of the elements of the first non-trivial eigenvector $y_1$ of the normalized Laplacian matrix, plotted
against their cardinality. The clear discontinuity around 0 indicates the presence of two big
clusters. Right graph: The first non-trivial eigenvector is plotted here against the second. The
presence of the two main clusters is clearly visible.

In order to obtain the optimal embedding in $k$ dimensions of the structure formed
by users in the space of the votes, ref. [Belkin and Niyogi 2003] prescribes finding
the first $k$ eigenvectors of the normalized Laplacian $L = D^{-1/2} (D - \Omega) D^{-1/2}$ of
$\Omega$, where $D$ is its diagonal weight matrix, of elements $D_{ii} = \sum_j \Omega_{ij}$. When used
for partitioning the population of users, the spectrum of the normalized Laplacian
$L$ has proved more effective and stable than that of other matrices [von Luxburg
et al. 2004]. Let $y_0, \dots, y_{k-1}$ be its first $k$ eigenvectors, in order of increasing eigenvalues.
Note that $y_0$ always has eigenvalue 0. The other non-trivial eigenvectors
contain information about possible subgraphs [Seary and Richards 2003]. If the
graph is connected and bipartite, for instance, the components of $y_1$ will be
positive for one subgraph and negative for the other. Whenever the two subgraphs
are not very well separated, the distinction between the two components becomes
progressively fuzzier. An example is given by the left graph of fig. 1, where the
components $y_1(j)$, $j = 1, 2, \dots, N$ of the first non-trivial eigenvector of the Laplacian
of the Jester database show a clear discontinuity. Higher eigenvectors contribute to
define the first two clusters and can reveal the contours of more weakly connected
blocks [Simonsen et al. 2003]. The right graph of fig. 1, in fact, shows the projection
of the Jester matrix on the first two non-trivial eigenvectors. The presence of
two islands can be detected by eye inspection, confirming that Jester users can be
grouped in, at least, two different categories.
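
The spectral step can be sketched with a dense symmetric eigensolver. The example below is only an illustration: it assumes that every user has a strictly positive total overlap $D_{ii}$, writes the normalized Laplacian in the equivalent form $L = I - D^{-1/2} \Omega D^{-1/2}$, and relies on `numpy.linalg.eigh`, which returns eigenvalues in ascending order. The name `laplacian_embedding` is hypothetical.

```python
import numpy as np

def laplacian_embedding(Omega, k):
    """First k eigenpairs of L = D^{-1/2} (D - Omega) D^{-1/2}.

    Omega : (N, N) symmetric overlap matrix with positive row sums
    k     : number of eigenvectors to keep (including the trivial one)
    Returns (eigenvalues, eigenvectors), eigenvectors stored as columns.
    """
    deg = Omega.sum(axis=1)                      # D_ii = sum_j Omega_ij
    inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.eye(len(Omega)) - inv_sqrt[:, None] * Omega * inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)         # ascending eigenvalues
    return eigvals[:k], eigvecs[:, :k]
```

On a toy graph made of two tightly connected pairs of users with a weak link between them, the second column of the returned eigenvectors takes opposite signs on the two pairs, which is the discontinuity described in the text.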

Let us now come back to the application of our spectral analysis to recommending
systems. Each user $i$ can be represented by a point in the $(k - 1)$-dimensional
subspace made of the $i$th components of the first $k$ eigenvectors, i.e.
$\mathbf{y}(i) = (y_1(i), \dots, y_{k-1}(i))$. One can think that each one of these coordinates contains
information about the degree of participation of user $i$ in a subgroup of users. The
parameter $k$, which plays the key role in dimensionality reduction, has to be determined
by a cross-checking procedure: we measure the performance of the algorithm
on the training set for different values of $k$, and choose the one which supplies the
best results. In our experiments this is of order 10.

Finally, we calculate the similarity $S_{ij}$ for each pair of users. After comparing
many different measures, the cosine turned out to work best. Thus we define

$$S_{ij} = \frac{\mathbf{y}(i)^T \mathbf{y}(j)}{\|\mathbf{y}(i)\| \, \|\mathbf{y}(j)\|},$$

where the superscript $T$ denotes the transpose of a vector, and $\|\cdot\|$ is the ordinary
$L_2$ norm. Here $S_{ij}$ can be interpreted as an overlap between the participation ratios
of two users in different groups of taste. Armed with this similarity matrix, we
predict votes according to eq. (1). Our spectral technique, although rather laborious,
performs better than the other methods we tested.
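
The cosine similarity between the embedded users can be computed in one matrix product. The sketch below assumes that the rows of `Y` hold the coordinates $\mathbf{y}(i)$ built from the non-trivial eigenvectors; the name `cosine_similarity` is hypothetical, and users with a zero vector are given zero similarity by convention.

```python
import numpy as np

def cosine_similarity(Y):
    """Cosine of the angle between every pair of rows of Y.

    Y : (N, k-1) float array, row i holds y(i) = (y_1(i), ..., y_{k-1}(i))
    """
    norms = np.linalg.norm(Y, axis=1, keepdims=True)
    unit = np.divide(Y, norms, out=np.zeros_like(Y), where=norms > 0)
    return unit @ unit.T
```

For instance, with the eigenvectors stored as columns of an array `vecs`, the trivial one coming first, `cosine_similarity(vecs[:, 1:])` would give a spectral similarity matrix.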

3. EXPERIMENTS 

The purpose of this paper is to evaluate different collaborative filtering algorithms,
and to establish when they can be used effectively. To this end, we have tested the
methods described above on two data sets, carrying completely different features:
MovieLens and Jester. In order to achieve a better understanding of the role played
by correlations in the votes, we have also made simulations on artificially generated
data. Prior to presenting our results, we shall describe the data sets used for the
experiment.

Fig. 2. Distribution of votes for MovieLens and Jester.

3.1 Data sets 

MovieLens (movielens.umn.edu) is a web service of the GroupLens project (grouplens.org)
that recommends movies. User ratings are recorded on a five-star scale
and contain additional information, such as the time at which an evaluation has
been made. The data set we downloaded contains 6040 users × 3952 movies, where
only a fraction $\eta_M = 0.041$ of all possible votes has actually been expressed. Jester
(shadow.ieor.berkeley.edu/humor) is an online joke recommendation system. It contains
73421 users × 100 jokes and a fraction $\eta_J = 0.55$ of expressed votes. User
ratings are real numbers ranging from −10 to 10.

While both websites collect ratings of users on items, they differ substantially
in many respects: the range of allowed ratings $R$, the users-to-items ratio $N/M$,
the sparsity of the voting matrix and the distribution of votes, among others. In
fact, while the MovieLens data set is roughly symmetric, the Jester one is heavily
asymmetric, with users outnumbering items by a factor of 734. This is because of the
fixed, low number of jokes ($M_J = 100$) one can evaluate in Jester. For the same
reason, the MovieLens data set is much sparser than Jester. In fact, defining the
sparsity coefficient as $\eta = n/(M \times N)$, where $n$ is the number of recorded ratings,
one has $\eta_M = 4\%$ and $\eta_J \simeq 55\%$. Note that $\eta$ decreases as the matrix gets sparser,
and not vice versa.

The most fundamental difference, though, is the amount of a priori information
provided to users. People choose the movies they want to watch on the basis of a
preliminary selection. They know actors and directors, read reviews and are exposed
to advertisements. When they actually buy the entrance ticket, they have
some motivated expectations. Accordingly, in the MovieLens data set, the distribution
of all the issued votes is unimodal and noticeably shifted towards the
positive region (see fig. 2, right panel). No preselection, on the other hand, is possible
with online jokes, giving rise to a more uniform distribution of votes. This
can be verified by arranging the votes in a 5-bin histogram and computing the
entropy $S = -\sum_i p_i \log_5 p_i$, which yields $S_J = 0.98$ for Jester and $S_M = 0.90$ for
MovieLens. In addition to that, the distribution of votes is slightly bimodal in
Jester, as shown in the left panel of fig. 2. This suggests the existence of groups
of users with similar taste, which is confirmed by fig. 1, as already pointed out. In
order to gain more insight into this fundamental difference, we also report, in fig. 5,
the distribution of user-user correlations, the effects of which will be discussed in
section 3.4.
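
The entropy check can be reproduced with a short helper. This is a sketch under assumptions: the bins are of equal width over the allowed rating range, and the logarithm is taken in base 5 so that a perfectly uniform 5-bin histogram gives $S = 1$. The name `histogram_entropy` is hypothetical.

```python
import numpy as np

def histogram_entropy(votes, v_min, v_max, bins=5):
    """Entropy S = -sum_i p_i log_bins(p_i) of the binned vote distribution.

    votes        : 1-D array of expressed votes
    v_min, v_max : bounds of the allowed rating range
    bins         : number of equal-width bins (also the base of the logarithm)
    """
    counts, _ = np.histogram(votes, bins=bins, range=(v_min, v_max))
    p = counts[counts > 0] / counts.sum()   # empty bins contribute nothing
    return float(-(p * np.log(p)).sum() / np.log(bins))
```

Values close to 1 indicate an almost uniform distribution of votes, while a distribution concentrated on a few ratings gives a lower entropy.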

The size of the data sets has been reduced by roughly 50% in both dimensions.
As the cancellations have been done randomly, the statistical properties of the original
data have been preserved. In particular, we tried to maintain the probability
distribution of the number of votes per user, as well as the sparsity and the $N/M$
ratio. We want to stress that this is crucial when testing the performance of predictive
algorithms on real data in an objective way. In fact, many experiments
can be found in the literature that only test recommender systems on dense voting
matrices. Typically, users who have judged too few items are struck out, as well as
items that have received too few votes. We did not follow this practice and
made an effort to keep the filtering level as low as possible, although this makes
predictions much more difficult.

Once filtered, the data are divided into a training and a test set. The training
set is composed of the data one actually uses to make predictions on the missing
evaluations contained in the test set. The latter is only employed afterwards, to
compare predictions and realised evaluations. We have chosen test sets of dimension
$n_{test} = 10^4$ for both databases. The experiments have been carried out as follows.
First, we fix the test set and never change it through the simulation. Then we
progressively fill the training set over time and make predictions on the entire test
set at fixed time steps.

Many different accuracy metrics have been proposed to assess the quality of
recommendations (see ref. [Herlocker et al. 2004]), one of the most common
being the Mean Absolute Error (MAE):

$$\mathrm{MAE} = \frac{1}{n_{test} R} \sum_{j,\beta} \left| v'_{j\beta} - v_{j\beta} \right|, \qquad (3)$$

where the sum runs over all expressed votes in the test set and $R = |v_{max} - v_{min}|$
is the size of the domain of all possible ratings. In our experiments, the MAE is
calculated, at different sparsity values $\eta$, on a unique test set. The results for our
sets of data will be presented in the following sections.
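
For concreteness, the error measure can be written as a small function. The sketch assumes that the MAE is the mean absolute difference between predicted and expressed votes on the test set, divided by the rating range $R$ so that results on different rating scales can be compared. The name `normalized_mae` is hypothetical.

```python
import numpy as np

def normalized_mae(predicted, actual, v_min, v_max):
    """Mean absolute error on the test set, in units of the rating range R."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    R = abs(v_max - v_min)
    return float(np.abs(predicted - actual).mean() / R)
```

With this convention a value of 0.25 on a five-star scale corresponds to an average error of one star.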

3.2 MovieLens 

After the filtering procedure, we cast the data set in a voting matrix $V_M$ ($N \times M$),
with $N = 3020$ and $M = 1976$. As previously mentioned, the MovieLens database
contains the time at which evaluations have been made. We have sorted the votes
according to their relative timestamp, both in the training set $V_M$ and in the test
set, which is composed of the last $10^4$ expressed votes. Such a choice is intended to
reproduce real application tasks, where one aims to predict future votes, which is,
of course, much harder than predicting randomly picked evaluations. It is somewhat
less realistic to fix the test set once and for all, but this has the advantage of allowing
more objective comparisons of the results.

Fig. 3. MAE of spectral (with $k = 20$) and correlation-based methods and movie average $m(\alpha)$, as a
function of $\eta_M$ for the MovieLens reduced data set $V_M$. The sparsity coefficient $\eta_M$ is increased
according to the timestamps of the expressed votes. The test set is composed of the last $10^4$
evaluations.

The training set has been filled, as well, by adding one vote at a time, according
to the temporal ordering. Predictions have been made at fixed sparsity values, as
shown in fig. 3. The MAE obtained with spectral and correlation-based methods
are compared therein. The solid line is the MAE of predictions made by taking
the average vote received by each movie $m(\alpha)$. Surprisingly, the results achieved
with this naive estimator are comparable to those of the sophisticated methods,
and outperform them in the very sparse region. Note that the movie average
predicts the same vote for every user, while the other methods produce personalized
recommendations. Their utility only emerges after a crossover value $\eta_M = 0.5$. For
most fillings, our spectral method (diamonds) performs slightly better than our
correlation method (circles), which also suffers from stronger fluctuations.

3.3 Jester 

After reduction, we are left with a data set that can be cast in a voting matrix
$V_J$ ($N \times M$), with $N = 3671$ and $M = 100$. The test set has been fixed once and for
all by random choice of $10^4$ evaluations. The training set has been filled randomly
and predictions have been produced at increasing $\eta$ values. The results are shown
in fig. 4. While in the MovieLens data set the item's average received vote $m(\alpha)$ is
a good predictor for all users, this is not at all the case here. A much better estimate
can be produced by the average vote a user has given to any item, represented by
the straight line in fig. 4. In fact, in this case, votes are given on an individual
basis: both the absolute opinions and the rating scales differ severely from user to
user. This might partly explain the fact that sophisticated methods enjoy an edge
over simple averages. Our spectral method (squares) performs much better than
our correlation method (circles), which, in turn, beats the user average $\langle v_i \rangle$ by a
large amount.

Fig. 4. MAE of spectral (with $k = 8$) and correlation-based methods and user average $\langle v_i \rangle$, as a
function of the sparsity, for the Jester data set $V_J$. Here $\eta_J$ is increased by random addition of
votes. The test set contains $10^4$ randomly picked evaluations.

All methods give rise to a smaller error in the Jester than in the MovieLens
set. This is due to various factors. First, $V_M$ is sparser than $V_J$. Comparing the
predictions at the same level of sparsity, though, Jester remains more predictable
(with our spectral method), in spite of the much smaller size of its data set. A
second factor is represented by the choice of the test set, which is not random in
the MovieLens case. Finally, the average correlation of Jester users is higher than
that of MovieLens. A more detailed explanation needs an analysis of the entire
distribution of correlations, which is the object of the next section.

3.4 Simulations 

The performance of any recommender system depends on the structure of the data
set under investigation. Upon assuming that the user-user correlation is the relevant
variable, the shape of its distribution can give us some preliminary information.
Since correlations are a measure of similarity of people's tastes, it is easy to
see that, when all users are equally correlated, the item's average received
vote is the best predictor. When the distribution of correlations is broad, on the
other hand, it becomes useful to make individual predictions, giving more weight to
highly correlated peers. For simplicity, the analysis can be restricted to the mean
and the standard deviation of the distribution of correlations. A higher absolute
mean increases the predictability; a broader distribution enriches the information
encoded and requires clever methods to be exploited.

Fig. 5. Distribution of correlations for MovieLens and Jester.



As a preliminary check of our analysis, we can look at the user-user correlations,
as calculated from eq. (2), of our two data sets. It is evident from fig. 5 that
Jester has a higher mean correlation ($\mu_J \simeq 0.1$) than MovieLens ($\mu_M = 0.02$), in
accordance with the fact that Jester allows for better predictions. It also appears
that MovieLens correlations have a lower standard deviation ($\sigma_M = 0.05$ vs.
$\sigma_J \simeq 0.16$). This explains, in our framework, why sophisticated methods give much
better results than simple averages in Jester (see fig. 4) and not in the case of
MovieLens (see fig. 3).

To test our hypothesis in a systematic way, we generate artificial votes, where
we control the structure of the correlation distribution, according to the following
procedure. First, we create a valid correlation matrix $C$ of fixed size $N \times N$,
with the desired mean $\mu$ and variance $\sigma^2$, as explained in appendix A. Then we
draw a multivariate Gaussian distribution of votes $V$ ($N \times M$), with $C$ as input
correlation matrix. Finally, we perform predictions on these artificially generated
data, comparing our correlation-based method with simple averages.
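
The second step of this procedure can be sketched with NumPy's random generator. The example assumes zero-mean, unit-variance votes and treats each item as an independent draw of an $N$-dimensional Gaussian vector whose covariance is the user-user correlation matrix $C$. The name `draw_votes` and the `seed` argument are hypothetical.

```python
import numpy as np

def draw_votes(C, M, seed=0):
    """Draw an (N, M) matrix of Gaussian votes whose users are correlated through C.

    C : (N, N) valid correlation matrix (symmetric, positive semi-definite, unit diagonal)
    M : number of items; each item is an independent draw from N(0, C)
    """
    rng = np.random.default_rng(seed)
    N = C.shape[0]
    draws = rng.multivariate_normal(np.zeros(N), C, size=M)  # shape (M, N)
    return draws.T                                           # rows = users, columns = items
```

For a large number of items, the empirical correlations between the rows of the returned matrix approach the entries of `C`, which gives a simple sanity check of the generated data.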

Fig. 6. Main graph: Mean prediction error of three recommending methods (user average, item
average, correlation-based), as a function of the standard deviation $\sigma$ of user-user correlations.
The dimension of the data set, generated artificially as in section 3.4, is $N = 250$, $M = 500$.
The correlation matrix is distributed uniformly, with mean $\mu = 0$ (empty symbols) and $\mu = 0.1$
(filled symbols). The lines are only meant to guide the eye. Inset: Same as the main graph, with
$\mu = 0.1$, but here the votes follow a bimodal distribution. As a consequence, the user average $\langle v_i \rangle$
works better than the item average $m(\alpha)$.

In fig. 6 we plot, for two different values of $\mu$, the MAE of the predictions as a
function of $\sigma$. In the main graph, open and filled symbols are simulations performed
with $\mu = 0$ and $\mu = 0.1$ respectively. As expected, the performance of user averages
$\langle v_i \rangle$, represented by triangles, does not depend on the parameters of the distribution
of correlations, as long as the distribution of votes is unimodal. On the contrary, the
object average $m(\alpha)$, represented by squares, obviously improves when $\mu$ increases,
but it is still independent of $\sigma$. The correlation method also works better for
$\mu = 0.1$ (filled circles) than for $\mu = 0$ (empty circles). Its MAE diminishes as
well for increasing $\sigma$. When $\sigma$ goes to zero, in fact, all pair correlations are equal,
$C_{ij} = c \; \forall i, j$, and the correlation-weighted sums in eq. (1) become proportional to
$m(\alpha)$. This appears clearly in fig. 6, where circles and squares tend to overlap for
$\sigma \to 0$. A similar situation occurs in the MovieLens database, where both $\mu$ and
$\sigma$ are very small. It is not surprising that, in the sparse region of fig. 3, the mean
becomes an even better predictor.
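
The limit of vanishing $\sigma$ can be made explicit with a short worked step. This is a sketch under assumptions that are not all spelled out in the text: eq. (1) is taken in the mean-centred form $v'_{j\beta} = \langle v_j \rangle + \sum_i S_{ji} (v_{i\beta} - \langle v_i \rangle)$, the normalization is $S_{ji} = C_{ji} / \sum_l |C_{jl}|$, every user has voted on item $\beta$, and the common correlation $c$ is positive.

```latex
C_{ij} = c > 0 \;\; \forall i,j
\quad\Longrightarrow\quad
S_{ji} = \frac{c}{\sum_{l=1}^{N} |c|} = \frac{1}{N},
\qquad
v'_{j\beta} - \langle v_j \rangle
  = \frac{1}{N} \sum_{i=1}^{N} \left( v_{i\beta} - \langle v_i \rangle \right).
```

The right-hand side is the average mean-centred vote received by item $\beta$ and no longer depends on the user $j$, so in this limit the correlation-based prediction carries no more information than an item average.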

Votes in the Jester database can be better predicted by user averages $\langle v_i \rangle$ than
by item averages $m(\alpha)$. The reason for this is that users are grouped, as shown in
fig. 1. The distribution of Jester votes is, as appears in fig. 2, slightly bimodal.
We have generated a data set with this additional feature and plotted the result in
the inset of fig. 6. We obtain a behaviour that is similar to that of fig. 4, confirming
our hypothesis.

4. CONCLUSIONS 

We have introduced a new method for recommending, based on the spectrum of the 
normalized Laplacian of a weighted graph in the user space. This has been tested 
on the MovieLens and Jester databases, together with a refined correlation-based 
method. 

The experiments have been made on raw data, without altering the statistical 
properties of the original voting matrix. In particular, no restriction has been 
imposed on the minimal number of votes expressed by users or received by items. 
For MovieLens, the opinions have been ordered according to their timestamp.

Our spectral method proves to be the most effective in all cases considered. 
The predictive power of recommender systems is stronger in the Jester than in 
the MovieLens case, where simple averages are able to detect most information 
contained in the data. We argue that this phenomenon is due, at least in part, to a 
different distribution of user-user correlations. When the latter is broader, in fact, 
sophisticated methods are much more rewarding. A distinction between unimodal 
and bimodal distributions of votes has been made to determine the best way to 
take simple averages. 

In conclusion, our findings can be used to determine whether or not it is worth
developing complex methods for recommending in specific contexts.

A. HOW TO DRAW A VALID CORRELATION MATRIX

Here we explain the procedure, inspired by ref. [Jaeckel and Rebonato 2000], to
create the correlation matrix $C$ used to draw a multivariate Gaussian distribution
of votes. The pair correlations of the population of users must follow a given distribution
with the desired mean $\mu$ and variance $\sigma^2$. Moreover, the matrix we are
looking for must be positive semi-definite, i.e. $\lambda_i \geq 0 \; \forall i$, where the $\lambda_i$'s are the
eigenvalues of $C$. Let us construct it step by step.

I) We create a square matrix $A$ ($N \times N$), with elements drawn from a given
distribution (uniform in our simulations) of mean $\mu$ and variance $\sigma^2$. $A$ will not
be symmetric in general. II) We apply the transformation $A = U + U^T$, where $U$
is the upper triangular matrix of $A$ and $U^T$ its transpose. $A$ is now symmetric,
but not positive semi-definite. III) We calculate the right eigensystem $E$ of the
real symmetric matrix $A$ and its associated set of eigenvalues $\{\lambda_i\}$. Hence
$A \cdot E = E \cdot \Lambda$, with $\Lambda = \mathrm{diag}(\lambda_i)$. Some eigenvalues can be negative. IV) Let us define
$\lambda'_i = \lambda_i$ if $\lambda_i > 0$ and $\lambda'_i = 0$ otherwise. The diagonal matrix $\Lambda'$ then has non-negative
elements $\lambda'_i$. V) Given the diagonal scaling matrix $T$ of elements
$t_i = \left[ \sum_m e_{im}^2 \lambda'_m \right]^{-1}$ and the matrices
$B' = E \sqrt{\Lambda'}$ and $B = \sqrt{T} E \sqrt{\Lambda'}$, the matrix $C = B B^T$ is positive semi-definite
and has unit diagonal elements by construction, but not the desired mean and
variance we imposed on the original matrix $A$. VI) By adding a constant value to
the elements of $C$, we adjust its mean to $\mu$ and its standard deviation to $\sigma$. We
rename the matrix thus obtained $A$ and iterate the algorithm, from step III, until
the desired values of $\mu$ and $\sigma$ are obtained for $C$ with the desired precision.
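
A single pass through steps I to V can be sketched as follows. This is an illustration under assumptions, not the authors' code: the uniform distribution is taken on the interval $[\mu - \sqrt{3}\sigma, \mu + \sqrt{3}\sigma]$, which has mean $\mu$ and standard deviation $\sigma$; the diagonal of the symmetrised matrix is set to 1; and the iterative adjustment of step VI is left out. The names `random_symmetric` and `nearest_valid_correlation` are hypothetical.

```python
import numpy as np

def random_symmetric(N, mu, sigma, seed=0):
    """Steps I-II: uniform entries with mean mu and std sigma, symmetrised.

    The unit diagonal is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    half = np.sqrt(3.0) * sigma                 # half-width of a uniform law with std sigma
    A = rng.uniform(mu - half, mu + half, size=(N, N))
    U = np.triu(A, k=1)                         # strictly upper triangular part
    return U + U.T + np.eye(N)

def nearest_valid_correlation(A):
    """Steps III-V: clip negative eigenvalues, then rescale rows to unit diagonal."""
    lam, E = np.linalg.eigh(A)                  # A E = E Lambda
    lam_clipped = np.clip(lam, 0.0, None)       # lambda'_i = max(lambda_i, 0)
    B_prime = E * np.sqrt(lam_clipped)          # B' = E sqrt(Lambda')
    t = 1.0 / (B_prime ** 2).sum(axis=1)        # t_i = [sum_m e_im^2 lambda'_m]^-1
    B = np.sqrt(t)[:, None] * B_prime           # B = sqrt(T) E sqrt(Lambda')
    return B @ B.T                              # positive semi-definite, unit diagonal
```

A full implementation would then shift the off-diagonal elements of the result towards the target mean and spread and repeat `nearest_valid_correlation` until both are matched within the desired precision, as described in step VI.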